Learning to classify documents according to genre
Similar resources
Learning to classify documents according to genre
Current document retrieval tools succeed in locating large numbers of documents relevant to a given query. While search results may be relevant according to the topic of the documents, it is more difficult to identify which of the relevant documents are most suitable for a particular user. Automatic genre analysis, that is, the ability to distinguish documents according to style, would be a usefu...
Learning to Classify Medical Documents According to Formal and Informal Style
This paper discusses an important issue in computational linguistics: classifying sets of medical documents into formal or informal style. This might be important for patient safety. Formal documents are more likely to have been published by medical authorities; therefore, the patients could trust them more than they can trust informal documents. We used machine learning techniques in order to ...
Learning to Classify Text from Labeled and Unlabeled Documents
In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper shows that the accuracy of text classifiers trained with a small number of labeled documents can be improved by augmenting this small training set with a large pool of unlabeled documents. We present a theoretical argume...
Using linked data to classify web documents
Purpose – To find a relationship between traditional faceted classification schemes and semantic web document annotators, particularly in the linked data environment. Design/methodology/approach – A consideration of the conceptual ideas behind faceted classification and linked data architecture is made. Analysis of selected web documents is performed using Calais’ Semantic Proxy to support the ...
Learning to Classify Questions
An automatic classifier of questions in terms of their expected answer type is a desirable component of many question-answering systems. It eliminates the manual labour and the lack of portability of classifying them semi-automatically. We explore the performance of several learning algorithms (SVM, neural networks, boosting) based on two purely lexical feature sets on a dataset of almost 2000 ...
Journal
Journal title: Journal of the American Society for Information Science and Technology
Year: 2006
ISSN: 1532-2882,1532-2890
DOI: 10.1002/asi.20427